TR  5 


Quality Control and Reliability


TECHNICAL REPORT


STATISTICAL  THEORY  AND  METHODS 
FOR VALIDATING RESULTS OF
SAMPLING  INSPECTION  BY  ATTRIBUTES 


(INSTALLATIONS  AND  LOGISTICS) 
WASHINGTON  25,  D.  C. 



OFFICE  OF  THE  ASSISTANT  SECRETARY  OF  DEFENSE 

WASHINGTON 25, D. C.


INSTALLATIONS AND LOGISTICS


16  April  1962 


Statistical Theory and Methods for Validating            TR-5
Results of Sampling Inspection by Attributes

Quality Control & Reliability


This technical report was prepared on behalf of the
Office of the Assistant Secretary of Defense (Installations
and Logistics) by the Chemical Corps, Department of the Army.
It was written by Mr. Henry Ellner of the Chemical Corps
Materiel Command, Edgewood, Maryland.

This publication provides the statistical theory and
techniques underlying the procedures and tables furnished
in DoD Handbook H109, "Statistical Procedures for Determining
Validity of Suppliers' Attributes Inspection," dated
6 May 1960.




TABLE OF CONTENTS


Paragraph  Title

1   Introduction
2   Product Verification Inspection
3   Significance Tests for Homogeneity
4   "Exact" Tests of Samples from Two Poisson Series
5   "Approximate" Tests for Poisson Variates
6   Alternative "Approximate" Tests for Poisson Variates
7   Power Function of Tests for Poisson Variates
8   Combination of Tests of Poisson Variates
9   Miscellaneous Desiderata
    a.  Continuing Tests
    b.  Poisson Approximation to the Binomial Distribution
    c.  Small Inspection Lots
    d.  Estimating Product Quality
10  Statistical Criteria for Paired Attribute Samplings
11  Concluding Remarks

APPENDIX

Table I   Limits for Determining Discrepancies Between Supplier's and
          Consumer's Paired Attributes Sampling Inspections (Critical
          Region at 5% Level of "Approximate" Poisson Two-Sample Test)

Table IA  Limits for Determining Discrepancies Between Supplier's and
          Consumer's Paired Attributes Sampling Inspections (Critical
          Region at 5% Level of "Exact" Poisson Two-Sample Test)

Table IB  Limits for Determining Discrepancies Between Supplier's and
          Consumer's Paired Attributes Sampling Inspections (Critical
          Region at 5% Level of Alternate "Approximate" Poisson
          Two-Sample Test)


STATISTICAL THEORY AND METHODS FOR VALIDATING RESULTS
OF SAMPLING INSPECTION BY ATTRIBUTES


Henry Ellner

Chemical Corps Materiel Command, Department of the Army

Simple but robust statistical methods are described and developed
for use in validating suppliers' inspection records of attribute
sampling data. The methods are essentially two-sample significance tests
for homogeneity of discrete variates treated as continuous, and the
combination of their probabilities to test the hypothesis of over-all
agreement of paired inspection results. The statistical theory and
techniques presented in this paper form the basis for DoD Handbook H109,
"Statistical Procedures for Determining Validity of Suppliers'
Attributes Inspection." The procedures of the handbook constitute a system
of product verification inspection wherein the consumer's sampling
results establish the reliability of the supplier's acceptance sampling
records and provide independent estimates of the quality of product
submitted for acceptance. The characteristics and application of alternate
tests are discussed. Utilization of the power function of the
significance tests affords substantial reduction in the amount of product
verification inspection.

1. INTRODUCTION. As an extension of the principle long recognized
by industry [1] that amount of inspection is a function of control of
product quality, the Department of Defense established a policy in
April 1954 that optimum use be made of inspection data obtained by
suppliers in determining acceptability of supplies. This broad policy
was implemented by prescribing uniform procedures in military and
federal series of specifications requiring the supplier to perform
examinations and tests itemized in the quality assurance provisions and
to maintain records of his inspection results. Verification of the
supplier's compliance with technical requirements of the contract was
made the responsibility of the Government representative.

When a system of sampling inspection, like MIL-STD-105 (Sampling
Procedures and Tables for Inspection by Attributes), is required of the
supplier, there is an incentive for him to upgrade and maintain an
acceptable quality level for the product submitted for consumer's
acceptance. The consumer's product inspection then can be adjusted to an amount
necessary to verify the sampling results recorded by the supplier.

To assist the Government representative in establishing the validity
of the supplier's inspection records, the Office of the Assistant
Secretary of Defense (Supply and Logistics) has published Quality Control
and Reliability Handbook (Interim) H109 [2], which provides procedures
for validating the results of sampling inspection by attributes as
recorded by a supplier. The underlying mathematical and statistical
principles of these procedures are included in this paper. The basic
concepts were derived from scattered literature sources and adapted to
meet the exigencies of the field inspector. For this purpose, simple
approximate tests for variates from discrete distributions were
investigated and developed to provide a systematic approach for
accomplishing product verification inspection.

2. PRODUCT VERIFICATION INSPECTION. To make the discussion more
concrete, reference will be made to MIL-STD-105 [3] and the terms
defined in that document will be used in what follows. In inspection by
attributes, the unit of product is classified simply as defective or
effective (nondefective) with respect to a given requirement or a set of
requirements. The requirement may be an individual checkpoint and the
set may be a group of characteristics of equal importance listed under a
single acceptable quality level (AQL) in the specification. We shall
assume that even when a measurement along a continuous numerical scale
is possible, such measurement will be classified as conforming or
nonconforming with the tolerance limits prescribed.

Let us now suppose that a supplier has drawn a single sample from
an inspection lot in accordance with MIL-STD-105, and has noted the
number of conforming and nonconforming items in the sample. The consumer
has proceeded likewise by selecting an equal or smaller sample from the
same lot (the size of the sample will be adjusted in subsequent trials).
We shall assume that the lot size is large relative to the total sample
size (say, at least 8:1) so that the respective samples can be
considered as independently drawn from a binomial population. When the
lot size is relatively small, the condition is imposed that the samples
be drawn without replacement. The results of the two inspections are
denoted symbolically in a 2 X 2 table as below:

TABLE 1


Notation for Two-Sample Test for Homogeneity

                      Defective    Effective    Sample Size

Supplier's Sample     d_s          n_s - d_s    n_s
Consumer's Sample     d_c          n_c - d_c    n_c

Total                 d_t          n_t - d_t    n_t


The sample sizes of the supplier and the consumer are represented
by $n_s$ and $n_c$, and their total by $n_t$. The number of defectives recorded
by the supplier and consumer are symbolized by $d_s$ and $d_c$, and their sum
by $d_t$. Product verification inspection is accomplished by comparing
the proportion defective in the supplier's sample with the proportion
defective in the consumer's sample. The comparison is considered a test
of homogeneity of the two samples since the concern is whether the
fraction defectives observed would be such as would only occur by chance
selection of the sample units, inspection being uniformly performed. The
problem is to set up criteria so that discrepancies arising by chance
alone are differentiated from those generated by disparities in the
inspection practice. Statistically this can be accomplished by
significance tests for homogeneity of the two sample results.

3. SIGNIFICANCE TESTS FOR HOMOGENEITY. A common test of significance
for dichotomized data is the chi-square test [4] and equivalent alternates.
When the expected number of defectives is small, say less than five,
Fisher's exact test ([5], Section 21.02) is generally advised. For
routine testing, these techniques all involve extensive computation, and
consequently are not suitable for verification purposes. Short cut
procedures [6, 7, 8, 9] devised to meet this problem, including nomograms
and extensive tabulations of Fisher's exact test, are likewise wanting in
that multiple entries are necessary or that tables required are too
lengthy and numerous.

A test for homogeneity, applicable when the proportion of defectives
$d_t/n_t$ is small, say 0.20 or less, is one which compares samples from
populations approximated by the Poisson type of distribution.

Przyborowski and Wilenski [10] considered two observations (in our
notation, $d_s$ and $d_c$) originating from two Poisson-distributed
populations with unknown means, and for the symmetrical case $n_s = n_c$ they
proposed an "exact" test for the equality of these means. Barnard [11]
extended their method to the case $n_s \neq n_c$, reducing the procedure to a
simple test for the variance-ratio F. Bross and Kasten [12] derived
an equivalent technique for the case $n_s \neq n_c$ and published charts for
avoiding or reducing computations. Cox [13] proposed a variance-ratio
test, treating variates as continuous, for the equivalence of two
Poisson processes. David and Johnson [14] and Lancaster [15] suggested
a probability integral transformation when the variable is discontinuous,
which was further amplified by Lancaster, who proposed the use of the
mid or median probability as a test function for discrete distributions.
The apparently different tests for Poisson variates can be shown to be
essentially equivalent when the number of events observed is not too
small. This has been noted by Barton [17] and is further discussed in
the development which follows. It will be shown that Cox's method has
certain properties which make it preferable for use in product
verification inspection. For accurate results in very small samples, this method
is to be preferred to approximate chi-square methods. For larger samples,
when appropriate tables of critical values are not available, the
approximate chi-square methods will be found to be surprisingly good.
Probabilities are obtained which correspond closely to those given by the
respective test functions suggested by Cox and Lancaster.


4. "EXACT" TESTS OF SAMPLES FROM TWO POISSON SERIES. Before the
features of the test functions of Cox and Lancaster can be discussed, it
will be necessary to derive the "exact" test for comparing two
Poisson-distributed observations. Suppose $d_s$ and $d_c$ of Table 1 approximately
follow independent Poisson distributions so that:

(1)  $$P(d_s, d_c \mid p_s, p_c) = P(d_s)\,P(d_c) = \frac{e^{-p_s n_s}(p_s n_s)^{d_s}}{d_s!}\cdot\frac{e^{-p_c n_c}(p_c n_c)^{d_c}}{d_c!}$$

where:

$p_s$ = the expected proportion defective in the supplier's sample $n_s$,
$p_c$ = the expected proportion defective in the consumer's sample $n_c$.

Under the null hypothesis $p_s = p_c = p_0$, Equation (1) reduces to:


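(2)  $$P(d_s, d_c \mid p_0) = \frac{e^{-p_0(n_s+n_c)}\,p_0^{\,d_t}\,n_s^{\,d_s}\,n_c^{\,d_c}}{d_s!\,d_c!}$$

which can be rewritten as:

(3)  $$P(d_s, d_c \mid p_0) = P(d_c \mid d_t)\,P(d_t \mid p_0) = \frac{d_t!}{d_s!\,d_c!}\left(\frac{n_s}{n_s+n_c}\right)^{d_s}\left(\frac{n_c}{n_s+n_c}\right)^{d_c}\cdot\frac{e^{-p_0(n_s+n_c)}\,(p_0 n_s + p_0 n_c)^{d_t}}{d_t!}$$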


But we need the probability of getting some pair of results having
the same total $d_s + d_c = d_t$; and so the relative probability, on the null
hypothesis, of getting the pair $(d_s, d_c)$ out of all results with the same
total $d_t$ is:

(4)  $$P(d_c \mid d_t) = \frac{P(d_c \mid d_t)\,P(d_t \mid p_0)}{P(d_t \mid p_0)} = \frac{d_t!}{d_s!\,d_c!}\left(\frac{n_s}{n_s+n_c}\right)^{d_s}\left(\frac{n_c}{n_s+n_c}\right)^{d_c}$$

If we let $r = n_s/n_c$, then:

(5)  $$P(d_c \mid d_t) = \binom{d_t}{d_c}\left(\frac{1}{1+r}\right)^{d_c}\left(\frac{r}{1+r}\right)^{d_s}$$

We note that conditionally on $d_t$, $d_c$ is binomially distributed with
parameters $\frac{1}{1+r}$ and $d_t$, which can be used as the basis for a
significance test. Accordingly

(6)  $$F(y) = \sum_{d_c=y}^{d_t}\binom{d_t}{d_c}\left(\frac{1}{1+r}\right)^{d_c}\left(\frac{r}{1+r}\right)^{d_t-d_c} = I_{\frac{1}{1+r}}(y,\ d_t - y + 1),$$

where $I_x(p,q)$ is the incomplete $\beta$-function representation of a sum of
binomial probabilities.

If the only admissible alternative to the null hypothesis
$p_s = p_c = p_0$ is $p_c > p_s$, the appropriate critical region, in the
Neyman-Pearson sense, for rejection of the null hypothesis is defined by

$$d_s \le k_1(d_t, \alpha), \quad\text{or equivalently}\quad d_c \ge k_2(d_t, \alpha),$$

where $\alpha$ is the risk of the first kind of error and where

(7)  $$P\big(d_c \ge k_2(d_t, \alpha) \mid p_s = p_c\big) \le \alpha.$$

For the "exact" test this may be expressed by:

(8)  $$I_{\frac{1}{1+r}}(d_c,\ d_s + 1) \le \alpha.$$

This inequality may be written in terms of the probability distribution
function $P_{f_1,f_2}(F)$ of the F distribution with $(f_1, f_2)$ degrees of freedom,
taken here as the probability of exceeding $F$, since:

$$P_{f_1,f_2}(F) = I_x(p,q)$$

where $f_1 = 2q$, $f_2 = 2p$ and $F = \frac{p\,(1-x)}{q\,x}$, with the result that

(9)  $$P_{2d_s+2,\ 2d_c}\!\left(\frac{r\,d_c}{d_s+1}\right) \le \alpha.$$

Inequalities (8) and (9) establish a level of significance which
does not exceed $\alpha$. The true level of significance depends upon the
unknown $p_0$ and may, in some cases where the expected number of defectives
is small, be considerably less than $\alpha$.
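
The tail probability in (8) can be evaluated directly as a binomial sum when no incomplete beta tables are at hand. The following is a minimal sketch under that reading of (5) and (8); the function names `exact_p_value` and `exact_critical_value` are illustrative and do not come from the handbook.

```python
from math import comb

def exact_p_value(d_s: int, d_c: int, r: float) -> float:
    """Tail probability of (8): P(x >= d_c | d_t) for a binomial variate
    with parameters 1/(1+r) and d_t = d_s + d_c."""
    d_t = d_s + d_c
    p = 1.0 / (1.0 + r)  # chance that a given defective falls in the consumer's sample
    return sum(comb(d_t, k) * p**k * (1.0 - p)**(d_t - k)
               for k in range(d_c, d_t + 1))

def exact_critical_value(d_t: int, r: float, alpha: float = 0.05):
    """Smallest d_c whose tail probability does not exceed alpha, or None
    when even d_c = d_t is not significant (hypothetical helper)."""
    for d_c in range(d_t + 1):
        if exact_p_value(d_t - d_c, d_c, r) <= alpha:
            return d_c
    return None

print(exact_p_value(0, 4, 1.0))       # 0.0625
print(exact_critical_value(4, 1.0))   # None
print(exact_critical_value(5, 1.0))   # 5
```

With equal sample sizes (r = 1) and only four defectives in total, no outcome reaches the 0.05 level, which illustrates how the true level of significance of the "exact" test can fall well below the nominal value.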

5. "APPROXIMATE" TESTS FOR POISSON VARIATES. In inverse binomial
sampling, with d fixed, Barnard [18] pointed out that when p' is small
and n is large the number of sample items drawn in sequence up to the
dth event is distributed approximately as $(2p')^{-1}\chi^2$, where $\chi^2$
denotes a chi-square variate with 2d degrees of freedom and p' represents
the true rate. For direct binomial sampling, approximated by Poisson's
exponential binomial limit, in which the number of events d occurring in
a fixed n is observed, we have

(10)  $$P(x \ge d) = \sum_{x=d}^{\infty}\frac{e^{-p'n}(p'n)^{x}}{x!} = P\!\left(\frac{1}{2p'}\,\chi^2_{2d} \le n\right),$$

(11)  $$P(x \ge d+1) = P\!\left(\frac{1}{2p'}\,\chi^2_{2d+2} \le n\right).$$

Cox [13] suggested an approximation to $P(x \ge d)$ in which d is treated as
a continuous variate by taking a quantity intermediate between (10) and (11):

(12)  $$P(x \ge d) \approx P\!\left(\frac{1}{2p'}\,\chi^2_{2d+1} \le n\right),$$

which implies that probabilities are calculated as if

(13)  $2p'n$ is distributed as $\chi^2_{2d+1}$.
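
The relation between Poisson tail sums and chi-square probabilities in (10) to (12) can be checked numerically. The sketch below assumes the standard closed forms for the chi-square distribution with integer degrees of freedom; the helper names `poisson_upper_tail` and `chi2_cdf` are illustrative only.

```python
from math import erfc, exp, factorial, pi, sqrt

def poisson_upper_tail(d: int, mean: float) -> float:
    """P(x >= d) for a Poisson variate with the given mean p'n."""
    return 1.0 - sum(exp(-mean) * mean**x / factorial(x) for x in range(d))

def chi2_cdf(x: float, df: int) -> float:
    """P(chi-square with df degrees of freedom <= x), for x > 0 and integer df."""
    chi = sqrt(x)
    if df % 2 == 0:
        # even df: the upper tail is a finite Poisson-type sum
        upper = exp(-x / 2.0) * sum((x / 2.0)**j / factorial(j)
                                    for j in range(df // 2))
    else:
        # odd df: two-sided normal tail plus a finite series
        series, term = 0.0, 1.0 / chi
        for j in range(1, (df - 1) // 2 + 1):
            term *= x / (2 * j - 1)   # chi^(2j-1) / (1*3*...*(2j-1))
            series += term
        density = exp(-x / 2.0) / sqrt(2.0 * pi)
        upper = erfc(chi / sqrt(2.0)) + 2.0 * density * series
    return 1.0 - upper

mean, d = 3.0, 2                              # illustrative values of p'n and d
eq10 = chi2_cdf(2.0 * mean, 2 * d)            # equals P(x >= d)
eq11 = chi2_cdf(2.0 * mean, 2 * d + 2)        # equals P(x >= d + 1)
cox = chi2_cdf(2.0 * mean, 2 * d + 1)         # intermediate quantity of (12)
print(eq11 < cox < eq10)                      # True
```

The odd number of degrees of freedom places Cox's probability strictly between the two discrete tail sums, which is the sense in which d is treated as continuous.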

When two populations with proportions defective $p_s$, $p_c$ are compared by
means of samples $n_s$, $n_c$ which exhibit $d_s$, $d_c$ defectives, then, from (12) we
compute the ratio:

(14)  $$\frac{2p_s n_s}{2d_s + 1}\bigg/\frac{2p_c n_c}{2d_c + 1}$$


which is distributed approximately as F with $(2d_s + 1,\ 2d_c + 1)$ degrees
of freedom. Thus, we may test the hypothesis that $p_s = p_c = p_0$ against
the alternate hypothesis that $p_c > p_s$ by referring

(15)  $$F = r\,\frac{d_c + 0.5}{d_s + 0.5}$$

to the F tables with $(2d_s + 1,\ 2d_c + 1)$ degrees of freedom for the
appropriate $\alpha$ percent point.

This may be represented by

(16)  $$P_{2d_s+1,\ 2d_c+1}(F) \le \alpha, \quad\text{or}$$

(17)  $$I_{\frac{1}{1+r}}(d_c + 0.5,\ d_s + 0.5) \le \alpha.$$

It is now clear that the "exact" tests given by (8) and (9) have
been modified slightly to yield the approximate tests of (16) and (17).
The modification has the effect of making the true level of significance
less dependent upon the unknown $p_0$ and to approximate the nominal value
of $\alpha$ when averaged over $d_t$.
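
Where F tables with odd degrees of freedom are not available, the incomplete beta function in (17) can be integrated numerically. The sketch below is one possible approach and not the handbook's procedure: it assumes the substitution t = sin^2(theta), which removes the end-point singularities for half-integer arguments, and uses Simpson's rule. All function names are illustrative.

```python
from math import atan, cos, pi, sin, sqrt

def _simpson(f, a: float, b: float, n: int = 2000) -> float:
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4.0 * sum(f(a + i * h) for i in range(1, n, 2))
    total += 2.0 * sum(f(a + i * h) for i in range(2, n, 2))
    return total * h / 3.0

def approximate_p_value(d_s: int, d_c: int, r: float) -> float:
    """I_{1/(1+r)}(d_c + 0.5, d_s + 0.5), the left-hand side of (17).

    With t = sin^2(theta) the beta integrand becomes
    sin^(2 d_c)(theta) * cos^(2 d_s)(theta), which is smooth."""
    f = lambda th: sin(th) ** (2 * d_c) * cos(th) ** (2 * d_s)
    theta_x = atan(1.0 / sqrt(r))     # sin^2(theta_x) = 1 / (1 + r)
    return _simpson(f, 0.0, theta_x) / _simpson(f, 0.0, pi / 2.0)

def f_statistic(d_s: int, d_c: int, r: float):
    """Variance ratio of (15) with its degrees of freedom."""
    return r * (d_c + 0.5) / (d_s + 0.5), 2 * d_s + 1, 2 * d_c + 1

print(round(approximate_p_value(0, 1, 1.0), 6))   # 0.18169
```

For d_s = 0, d_c = 1 and r = 1 the integral has the closed form 1/2 - 1/pi, which gives a convenient check on the numerical routine.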

6. ALTERNATIVE "APPROXIMATE" TESTS FOR POISSON VARIATES. The median
probability defined by

(18)  $$\tfrac{1}{2}\big[P(x \ge d_c \mid d_t) + P(x \ge d_c + 1 \mid d_t)\big]$$

was considered by Lancaster [19] as a test function for paired Poisson
variates. In terms of the incomplete $\beta$-function the critical region
may be expressed by:

(19)  $$\tfrac{1}{2}\Big[I_{\frac{1}{1+r}}(d_c,\ d_s + 1) + I_{\frac{1}{1+r}}(d_c + 1,\ d_s)\Big] \le \alpha.$$

A comparison of the critical regions expressed by (8), (17) and (19)
shows that for the rejection rule:

$$d_c \ge k_2(d_t, \alpha),$$

the critical values of the median probability test are bounded by the
corresponding values for the other two tests.


Using the median probability test as a basis for comparison,
Lancaster [19] has shown arithmetically that the median probability
usually closely corresponds with the probability of the uncorrected
chi-square test for sets of two counts, $x_1$ and $x_2$. His investigation was
limited to the simple form:

(20)  $$\chi^2 = \frac{(x_1 - x_2)^2}{x_1 + x_2}$$

A similar test may be developed for the paired variates $d_s$ and $d_c$
of Table 1, which we shall suppose to be Poisson distributed.
Accordingly, for a one-sided test we have:

(21)  $$\chi^2 = \frac{\left(d_c - \frac{d_t}{1+r}\right)^2}{\frac{d_t}{1+r}} + \frac{\left(d_s - \frac{r\,d_t}{1+r}\right)^2}{\frac{r\,d_t}{1+r}}$$

where, as before, $r = n_s/n_c$ and $d_t = d_c + d_s$. Equation (21) reduces to:

(22)  $$\chi^2 = \frac{(r\,d_c - d_s)^2}{r\,d_t}, \quad\text{or}$$

(23)  $$X = \frac{r\,d_c - d_s}{\sqrt{r\,d_t}}$$

where X is approximately normally distributed. When $d_t$ is not too small
and r is not much larger than one, the $\chi^2$ approximation yields probabilities
which correspond closely to the median probabilities. As $d_t$ increases
r may increase without essentially disturbing the correspondence of the
probabilities of the two alternate "approximate" tests.
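
The correspondence between the median probability of (19) and the normal deviate of (23) can be examined for any pair of counts. The following sketch uses only binomial sums and the complementary error function; the function names and the counts in the example are illustrative assumptions, not values taken from the handbook.

```python
from math import comb, erfc, sqrt

def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(x >= k) for a binomial variate with parameters p and n."""
    return sum(comb(n, j) * p**j * (1.0 - p)**(n - j) for j in range(k, n + 1))

def median_probability(d_s: int, d_c: int, r: float) -> float:
    """Left-hand side of (19): mean of P(x >= d_c) and P(x >= d_c + 1)."""
    d_t, p = d_s + d_c, 1.0 / (1.0 + r)
    return 0.5 * (binomial_upper_tail(d_c, d_t, p)
                  + binomial_upper_tail(d_c + 1, d_t, p))

def normal_approximation(d_s: int, d_c: int, r: float) -> float:
    """One-sided tail area of the deviate X of (23)."""
    x = (r * d_c - d_s) / sqrt(r * (d_s + d_c))
    return 0.5 * erfc(x / sqrt(2.0))

print(round(median_probability(2, 8, 1.0), 4))     # 0.0327
print(round(normal_approximation(2, 8, 1.0), 4))   # 0.0289
```

With ten defectives in total and equal sample sizes the two probabilities differ by less than half a percentage point, in line with the statement that the agreement is close when the total count is not too small and r is near one.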


7. POWER FUNCTION OF TESTS FOR POISSON VARIATES. The Neyman-Pearson
theory of tests considers all tests of the same size and lays down
objective standards for selecting the best test. The theory introduces the
term, "power of a test," relative to the alternate hypothesis, to denote
the probability of correctly rejecting the null hypothesis when an
alternative is true. Of all tests at a given significance level, the most
preferred is the one which has the maximum power relative to all the
alternate hypotheses considered. The probability of rejecting the null
hypothesis regarded as a function of H', where H' is any of the
admissible alternates to $H_0$, is called the power function of the test. If we
commence with the determination of the critical region subject to (7) we
can calculate the power function of a given test of significance. Thus,
for the "exact" test all points satisfying (8) are entered in (1) and the
absolute probabilities are summed. Similarly, for the "approximate" test
all points satisfying (17) are entered in (1) for addition of the absolute
probabilities. Tables 2 and 3 provide the actual probabilities associated
with the respective one-sided tests of the null hypothesis $p_c = p_s$ against
the alternatives $p_c = 2p_s$, $p_c = 3p_s$ and $p_c = 4.5p_s$, with r = 1, 2, 3, 5
and 8, respectively, over a range of nuisance parameters $p_s n_s$, which may
be encountered in practice. A similar table can be computed for all
critical values of the alternate "approximate" test satisfying (19). The
size of this test and its power function are somewhat less than those of
the "approximate" test (17).
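
The summation described above, entering every point of the critical region in (1) and adding the absolute probabilities, can be sketched as follows for the "exact" test. The truncation bound `limit` and the function names are assumptions made for illustration; `limit` must comfortably exceed both expected counts for the truncated sum to be accurate.

```python
from math import comb, exp, factorial

def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events for a Poisson variate."""
    return exp(-mean) * mean**k / factorial(k)

def exact_p_value(d_s: int, d_c: int, r: float) -> float:
    """Binomial tail of (8) with parameters 1/(1+r) and d_t."""
    d_t, p = d_s + d_c, 1.0 / (1.0 + r)
    return sum(comb(d_t, k) * p**k * (1.0 - p)**(d_t - k)
               for k in range(d_c, d_t + 1))

def exact_power(mean_s: float, ratio: float, r: float,
                alpha: float = 0.05, limit: int = 40) -> float:
    """Probability that (8) rejects when p_s * n_s = mean_s,
    p_c = ratio * p_s and n_s = r * n_c (sum truncated at `limit`)."""
    mean_c = ratio * mean_s / r       # expected defectives in the consumer's sample
    power = 0.0
    for d_s in range(limit):
        for d_c in range(1, limit):
            if exact_p_value(d_s, d_c, r) <= alpha:
                power += poisson_pmf(d_s, mean_s) * poisson_pmf(d_c, mean_c)
    return power

size = exact_power(1.5, 1.0, 1.0)     # true level of significance under the null
power = exact_power(1.5, 3.0, 1.0)    # power against p_c = 3 p_s
print(size <= 0.05 < power)
```

Replacing `exact_p_value` by an evaluation of (17) or (19) would give the corresponding figures for the two "approximate" tests.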

The arrangement of Table 2 and Table 3 clearly reveals that the true
significance level is a function of the expected number of defectives in
the supplier's sample and the ratio of the supplier's sample size to the
size of the consumer's validation sample. For the "exact" test, under the
null hypothesis, $p_c = p_s$, the size of the test increases on the average by
a factor of ten as $p_s n_s$ increases from 0.75 to 12.00. In contrast, for
the "approximate" test, $\alpha$ increases about 1.5 times over the same range
of expected number of defects. Furthermore, the mean levels of significance
of the entries summed over the five tabular values of r for the "exact" and
"approximate" tests are 0.017 and 0.031, respectively. The conclusion is
that the "approximate" test more effectively controls the size of the test
at the significance level of 0.05 than the "exact" test.

Since we can generally estimate $p_s n_s$ from the supplier's record of
inspection results and the AQL under which he is operating, we can select
the power of test by adjusting the sample size ratio r commensurate with
the relative fraction defective, $p_c/p_s$, which should be detected if it
exists. This power can be further augmented by simple pooling of
inspection results for a given r until the expected number of defectives for the
supplier's samples exceeds the desired values of $p_s n_s$. Birnbaum [20]
has considered various methods of comparing two Poisson processes in terms
of the ratio of their parameters, and suggests for fixed samples an
accumulation of observations until the total number of defectives $d_t$ is
sufficient to yield the power of test desired.

The values of $p_s n_s$ and r in Table 3 were selected so that
corresponding power function curves could be obtained for a set of alternate
hypotheses. Thus, the probabilities of rejection associated with
$p_c/p_s$ = 1, 2, 3 and 4.5, respectively, and subject to the parameters
$p_s n_s$ = 1.50 and r = 1, approximate the rejection rates tabulated for the
following row and columnar headings of Table 3:

$p_s n_s$ = 2.25 and r = 2; $p_s n_s$ = 3.00 and r = 3;
$p_s n_s$ = 4.50 and r = 5; and, $p_s n_s$ = 6.00 and r = 8.

Parallel patterns run diagonally from upper left to lower right. This
tendency can also be discerned in Table 2 for the larger values of $p_s n_s$.

TABLE 2

Power of "Exact" Test of Homogeneity of
Paired Attribute Samples Proportional in Size
(one-tail test, $\alpha$ = 0.05)


TABLE 3

Power of "Approximate" Test of Homogeneity of
Paired Attribute Samples Proportional in Size

Curiously, the value of the power function depends essentially upon the
position of r in the Fibonacci sequence and the position of the nuisance
parameter in one of two interpenetrating geometric series whose first
terms differ by 0.375. Another way of viewing the sequence of $p_s n_s$ values
is as follows: $\frac{9}{8} \pm \frac{3}{8}$, $\frac{9}{4} \pm \frac{3}{4}$, $\frac{9}{2} \pm \frac{3}{2}$ and $9 \pm 3$.

8. COMBINATION OF TESTS OF POISSON VARIATES. When the sample size
ratio r is varied from trial to trial or the classification of a defect is
altered with each examination, pooling of sampling results is inappropriate
for application of tests represented by (8), (17), (19), and (23). What is
needed is an omnibus type test to combine all of the evidence obtained to
provide a single measure of confidence in the supplier's inspection
results.

From the $\alpha$ risks associated with the "exact" and "approximate" tests
under the null hypothesis we can expect a certain frequency of significant
differences. Further, from the $\beta$ risks associated with these tests we
can expect a certain frequency of erroneous acceptances of false
hypotheses. Accordingly, it is not correct to reject or accept the
general hypothesis that the supplier's inspection data are as a whole
unreliable as a consequence of the individual lot comparisons, which
taken separately appear to yield either significant or non-significant
results. The over-all test calls, therefore, for the combination of a
number of independent tests of significance. Fisher ([5], Section
21.1) has given a general method for combining the probabilities of
several mutually independent tests. A number of other writers have
discussed and illustrated this problem, but A. Birnbaum [21] has shown
that Fisher's method is to be preferred for its somewhat more uniform
sensitivity to the alternatives of interest.

The over-all test developed by Fisher deals with continuous variables.
It will yield biased results if applied directly to probabilities derived
from the "exact" test for Poisson variates. Lancaster [15], David and
Johnson [14], Tocher [22] and Pearson [23] and Yates [24, 25] have
considered the difficulties encountered by the combination of tests based on
discontinuous variates. Since the "approximate" test and its alternates
treat the number of events, $d_s$, $d_c$, as continuous variates the probabilities
obtained can be handled on a practical basis by application of Fisher's
probability integral, which may be defined generally as follows:

Let p(x) be the probability density function of a continuous random
variable x in the interval $a \le x \le b$, where p(x) = 0 for x < a or
x > b.

Then if

(24)  $$P = \int_a^x p(x)\,dx,$$

P is uniformly distributed in the interval (0,1) and $-2\log_e P$ is
distributed as $\chi^2$ with 2 degrees of freedom.

If now we combine k independent probabilities, the combined
probability is the product of the k separate probabilities, or

(25)  $$\sum_{i=1}^{k} z_i = -2\log_e(P_1 P_2 \cdots P_k) = -2\sum_{i=1}^{k}\log_e P_i$$

and so has the $\chi^2$ distribution with 2k degrees of freedom. Thus, by
means of the probability integral transformation, any number of
probabilities $P_1, P_2, \ldots, P_k$ can be converted to a $\chi^2$ value and, using the
properties of the $\chi^2$ distribution, may be summed together with the
degrees of freedom to yield from published tables an over-all probability.
The application of these results to continuous population is
straightforward.
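
In place of published tables, the over-all probability in (25) can be computed directly, since the upper tail of the chi-square distribution with an even number of degrees of freedom has a closed form. The sketch below assumes every probability is strictly positive; the function name is illustrative.

```python
from math import exp, factorial, log

def fisher_combined_probability(probabilities):
    """Return the statistic of (25) and its upper-tail probability
    under chi-square with 2k degrees of freedom."""
    k = len(probabilities)
    statistic = -2.0 * sum(log(p) for p in probabilities)
    half = statistic / 2.0
    # closed-form upper tail of chi-square with an even number (2k) of d.f.
    tail = exp(-half) * sum(half**j / factorial(j) for j in range(k))
    return statistic, tail

statistic, overall = fisher_combined_probability([0.05, 0.05])
print(round(statistic, 3), round(overall, 4))   # 11.983 0.0175
```

Two independent comparisons each significant at exactly 0.05 combine to an over-all probability of about 0.0175, while a single probability is returned unchanged.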

For discrete populations, such as the binomial represented by (5),
the over-all probability is biased when the null hypothesis is true. The
expectation of $\chi^2$ for discontinuous variates is always below the
theoretical value of 2. Thus, for the case $d_c + d_s = 4$ and r = 1, we obtain, under
the null hypothesis, the binomial $(1/2 + 1/2)^4$ and find from Table 4 below for
a one-sided comparison that the expectation of $-2\log_e P_i$ is 1.241 and the
variance of the distribution is 3.527.

TABLE 4


Distribution of Probability Integral Transformation Applied to
"Exact" Test for Case of Binomial (1/2 + 1/2)^4
(one-sided comparison)


  No. of Events     Relative        Cumulative      Probability Integral
                    Frequency       Probability     Transformation
  d_s     d_c       of d_s, d_c     P_i             z_i = -2 log_e P_i

   4       0        0.0625          1.0000          0
   3       1        0.2500          0.9375          0.1291
   2       2        0.3750          0.6875          0.7494
   1       3        0.2500          0.3125          2.3263
   0       4        0.0625          0.0625          5.5452

NOTE:                                   Expectation     Variance

  chi-square, f = 2 (theoretical)       2.000           4.000
  z_i = -2 log_e P_i                    1.241           3.527

Similarly, for the case of the binomial $(1/3 + 2/3)^5$, which can be
derived from (5), the expectation of $\chi^2$ is 1.314 and the variance of the
distribution is 2.482. There is clearly considerable bias when the
probability integral transformation is applied to the probabilities derived from
the "exact" test. In contrast, Table 5 below indicates comparative lack of
bias in the "approximate" test (17) when we wish to combine its results for
a series of independent determinations to verify a common hypothesis: that
the supplier's inspection records are reliable. For the binomial
distribution just discussed, where $d_t = 5$ and p = 1/3, Table 5 indicates that for
Cox's "approximate" method the expectation is 2.042 and the variance of
the binomially-distributed probability integral transformation is 4.393.
Even for an extremely small number of observed defects the continuity
correction of the "approximate" test is very effective.
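
The bias of the transformation applied to the "exact" test can be reproduced by weighting each value of -2 log_e P by its binomial frequency. The sketch below is an illustration under that reading; the function name is hypothetical, and the example uses the binomial (1/3 + 2/3)^5 case quoted above.

```python
from math import comb, log

def exact_transform_moments(d_t: int, p: float):
    """Mean and variance of z = -2 log_e P when d_c is binomial (p, d_t)
    and P is the exact upper-tail probability P(x >= d_c)."""
    freq = [comb(d_t, k) * p**k * (1.0 - p)**(d_t - k) for k in range(d_t + 1)]
    mean = second = 0.0
    for d_c in range(d_t + 1):
        tail = sum(freq[d_c:])        # cumulative probability P_i
        z = -2.0 * log(tail)          # probability integral transformation
        mean += freq[d_c] * z
        second += freq[d_c] * z * z
    return mean, second - mean**2

mean, variance = exact_transform_moments(5, 1.0 / 3.0)
print(round(mean, 3), round(variance, 3))   # 1.314 2.482
```

Both moments fall well short of the theoretical values of 2 and 4 for chi-square with 2 degrees of freedom, which is the bias that the half-unit adjustment in (17) is meant to reduce.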

TABLE 5

Expectances and Variances of Binomially-Distributed
Probability Integral Transformations Derived from
"Approximate" Tests of Poisson Variates
(one-sided comparison)


        1/(1+r) = 1/2     1/(1+r) = 1/3     1/(1+r) = 1/4     1/(1+r) = 1/6     1/(1+r) = 1/9

 d_t    E(z_i) Var(z_i)   E(z_i) Var(z_i)   E(z_i) Var(z_i)   E(z_i) Var(z_i)   E(z_i) Var(z_i)

  5     2.045  4.364      2.042  4.393      2.044  4.392      2.056  4.409      2.084  4.391
  4     2.050  4.316      2.051  4.253      2.056  4.485      2.072  4.485      2.111  4.453
  3     2.045  4.108      2.067  4.540      2.074  4.604      2.101  4.585      2.159  4.431
  2     2.024  3.540      2.086  4.463      2.106  4.686      2.158  4.678      2.257  4.458
  1     1.905  2.259      2.084  3.630      2.170  4.165      2.302  4.377      2.474  4.146

NOTES:  (a)  $z_i = -2\log_e I_{\frac{1}{1+r}}(d_c + 0.5,\ d_s + 0.5)$.

        (b)  Conditionally on $d_t$, $d_c$ is binomially distributed with
             parameters $\frac{1}{1+r}$ and $d_t$.

        (c)  $E(\chi^2)_{f=2} = 2.000$; $\sigma^2(\chi^2)_{f=2} = 4.000$.
TABLE 6

Expectances and Variances of Binomially-Distributed
Probability Integral Transformations Derived from
Median-probability Tests of Poisson Variates
(one-sided comparison)

Columns are headed by p = 1/(1+r).

          p = 1/2          p = 1/3          p = 1/4          p = 1/6          p = 1/9
d_t   E(z_i) Var(z_i)  E(z_i) Var(z_i)  E(z_i) Var(z_i)  E(z_i) Var(z_i)  E(z_i) Var(z_i)
 5    1.940   3.461    1.930   3.444    1.927   3.417    1.910   3.332    1.889   3.240
 4    1.936   3.382    1.917   3.405    1.913   3.333    1.895   3.253    1.868   3.144
 3    1.940   3.225    1.913   3.302    1.897   3.242    1.871   3.125    1.841   2.98:
 2    1.953   2.858    1.911   3.154    1.882   3.142    1.841   3.009    1.796   2.821
 1    1.981   1.975    1.940   2.553    1.898   2.754    1.829   2.814    1.754   2.693

NOTES:  (a)  =  -log^  |l  ^  ^  ^  (d^.d,  /  1)  /  I  ^  (d^  /  l,dg)j  . 

(b)  Conditionally on $d_t$, $d_c$ is binomially distributed with
     parameters $1/(1+r)$ and $d_t$.

(c)  $E(\chi^2_2) = 2.000$;  $\mathrm{Var}(\chi^2_2) = 4.000$.

Similarly, for the median probability test denoted by (19), the
expectations and the variances of the binomially distributed probability
integral transformation (23) approach the theoretical values of 2.000 and
4.000, respectively. Table 6 discloses that for the case $d_t = 5$ and
$p = 1/3$, the expectation and variance are, respectively, 1.930 and 3.444.
A comparison of Table 5 and Table 6 reveals that (if the $d_t$ occurrences
are few and the sample size ratio $r$ is large) the median probability test
tends to be negatively biased, with variances less than the theoretical,
and the approximate test of Cox positively biased, with variances greater
than the theoretical. In product verification inspection, where the
probabilities of each trial are combined, the Cox test maintains its power
to detect inspection discrepancies with a small validation sample and
guards against insidious differences. On the other hand, the median
probability method of Lancaster provides a more conservative approach
before taking serious action. When an assignable cause can be readily
established for a statistically significant discrepancy, the Cox test is
the method of choice.

An alternate method of combining probabilities of Poisson variates
is to combine the $\chi^2$ values of tests represented by (20), or more
generally by (22), according to the addition theorem for the $\chi^2$
distribution. The sum of $k$ values of $\chi^2$, each with 1 degree of
freedom, is distributed as $\chi^2$ with $k$ degrees of freedom. This test
lacks power in detecting a difference that is consistently one-sided. An
alternative is to compute the $k$ values of equation (23) and add them,
taking account of the signs of differences. As $\chi$ is approximately
normally distributed with unit standard deviation and zero mean, the sum
of $k$ $\chi$-values is approximately normally distributed with zero mean
and standard deviation equal to $k^{1/2}$. The test function is the normal
deviate

(26)  $\sum \chi / k^{1/2}$ .
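
The test function (26) can be sketched in a few lines. This is a minimal illustration under the stated normal approximation. The function names and the sample $\chi$-values are hypothetical and are not taken from the report.

```python
import math


def combined_normal_deviate(signed_chi):
    """Test function (26): sum of the signed chi-values divided by sqrt(k)."""
    k = len(signed_chi)
    if k == 0:
        raise ValueError("at least one chi-value is required")
    return sum(signed_chi) / math.sqrt(k)


def upper_tail(z):
    """One-sided tail area of the unit normal distribution beyond z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical example: four trials, each with chi = +1.0.
    deviate = combined_normal_deviate([1.0, 1.0, 1.0, 1.0])
    print(deviate, upper_tail(deviate))
```

With these illustrative values the deviate is 2.0, with a one-sided tail area of about 0.023. The same four results summed as $\chi^2$ give 4.0 on 4 degrees of freedom, a tail area of about 0.41, which shows why the unsigned sum is insensitive to a consistently one-sided difference.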

Equations (22) and (26) were applied by the author to a set of
2 X 2 tables treated by Yates [25]. The total sample sizes of the
individual trials differed by a factor of ten and the ratio of sample
sizes within a trial varied from 0.55 to 1.98. The relative incidences
of events were all in the range 1.8 - 9.5%. The significance levels
obtained by Yates, by using conventional methods for combining proba-
bilities, were all in close agreement with the value derived from (26).

9. MISCELLANEOUS DESIDERATA. a. Continuing Tests - Aroian and
Levene [26] have pointed out that in the classical theory of testing
hypotheses a decision is made after a single trial, with the consequence
that no further observations need be made. Cumulation of test results
in sequential analysis or in combination of test probabilities also leads
to a termination of observations. In product verification inspection, as
well as in quality control work, observations are taken in sequence. At
regular intervals a decision is made to follow a certain course of action.
This type of test is called a continuing test since the observations are
continued indefinitely.

If we suppose that in product verification inspection the supplier's
sampling results are in accordance with the consumer's, there will be a
probability of $\alpha$ at each decision point of finding a significant
difference, and action will be taken unnecessarily once in every $1/\alpha$
decision points in the long run. Now, if divergent inspection results
should suddenly appear due to real differences in inspection practice,
there is a probability $\gamma$ of taking action at each decision point
(assuming that $p'_c/p'_s$ is constant from trial to trial) until remedial
action has been taken. Then the decision to take action will be made at,
say, the $K$th decision point. The probability that $K = K_0$ is the
probability that we fail to take action at the first $K_0 - 1$ points and
do take it at the $K_0$th:

(27)  $P(K = K_0) = (1 - \gamma)^{K_0 - 1}\, \gamma$

Similarly,

(28)  $P(K \le K_0) = 1 - (1 - \gamma)^{K_0} = \epsilon$

If we think in terms of continuing tests, we realize that if we do
not take action at the first decision point after a real discrepancy
arises, we can still do so at the second, third, etc. If we fix
$P(K \le K_0)$ and $K_0$ we can choose $\gamma$ from (28). Then we can state
that departure from inspection accordance can be detected by the $K_0$th
decision point with a probability of $\epsilon$.
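
The waiting-time relations (27) and (28) are easily evaluated. The sketch below assumes a constant single-trial power $\gamma$ from trial to trial, as in the text. The function names and numerical values are illustrative only.

```python
def first_action_probability(gamma, k0):
    """Equation (27): action is first taken at exactly the k0-th decision point."""
    return (1.0 - gamma) ** (k0 - 1) * gamma


def detection_probability(gamma, k0):
    """Equation (28): action is taken at or before the k0-th decision point."""
    return 1.0 - (1.0 - gamma) ** k0


def required_power(epsilon, k0):
    """Equation (28) solved for gamma, given epsilon and k0."""
    return 1.0 - (1.0 - epsilon) ** (1.0 / k0)


if __name__ == "__main__":
    # Hypothetical example: single-trial power 0.5, three decision points.
    print(first_action_probability(0.5, 3))   # 0.125
    print(detection_probability(0.5, 3))      # 0.875
    print(required_power(0.875, 3))           # approximately 0.5
```

A single-trial power of only 0.5 thus gives a probability of 0.875 of acting within three decision points, which is the sense in which the continuing test augments the power of a single test.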

b. Poisson Approximation to the Binomial Distribution -
Many textbooks in probability have stated that if $p' \to 0$ and
$n \to \infty$ so that $np' \to \lambda$, where $\lambda$ is fixed and
$0 < \lambda < \infty$, the binomial distribution converges to the Poisson
distribution with expectation $\lambda$. If we let $p_i$ denote the success
probability of the $i$th trial, $i = 1, 2, \ldots, n$, then the number of
events which occur has the distribution sometimes called "Poisson
binomial." Hodges and Le Cam [27] state that R. von Mises pointed out that
the latter distribution has in the limit a Poisson distribution, provided
that $n \to \infty$ and the $p_i$ vary with $n$ in such a way that
$\sum p_i = \lambda$ is fixed and $\alpha = \max\{p_1, p_2, \ldots, p_n\}$
tends to 0. The limit theorem of von Mises suggests that the Poisson
approximation will be reliable provided $n$ is large, $\alpha$ is small, and
$\lambda$ is moderate. Hodges and Le Cam [27] showed that these
requirements are unnecessarily restrictive, and that the Poisson
approximation will be good provided only $\alpha$ is small, whether $n$ is
small or large, and whatever value $\sum p_i$ may have. Furthermore, they
state that the number of events will have approximately a Poisson
distribution even if a few of the $p_i$ are quite large, provided these
values contribute only a small part of the total $\sum p_i$.
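
The quality of the approximation for unequal $p_i$ can be examined directly. The following sketch is an illustration only. It builds the exact "Poisson binomial" distribution by successive convolution and reports the largest difference between its cumulative distribution and that of a Poisson variate with the same expectation. The function names and the sets of $p_i$ values are hypothetical.

```python
import math


def poisson_binomial_pmf(probs):
    """Exact distribution of the number of events in independent trials
    with success probabilities probs, built one trial at a time."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)      # trial yields no event
            nxt[k + 1] += mass * p          # trial yields an event
        pmf = nxt
    return pmf


def poisson_pmf(lam, k):
    return math.exp(-lam) * lam ** k / math.factorial(k)


def max_cdf_gap(probs):
    """Largest absolute difference between the exact and Poisson
    cumulative distributions, the Poisson mean being sum(probs)."""
    lam = sum(probs)
    gap = cum_exact = cum_poisson = 0.0
    for k, mass in enumerate(poisson_binomial_pmf(probs)):
        cum_exact += mass
        cum_poisson += poisson_pmf(lam, k)
        gap = max(gap, abs(cum_exact - cum_poisson))
    return gap


if __name__ == "__main__":
    print(max_cdf_gap([0.01] * 50))            # all p_i small
    print(max_cdf_gap([0.3] + [0.005] * 40))   # one large p_i among small ones
```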

The note of Hodges and Le Cam implies that pooling of approximate
Poisson variates for an individual test of homogeneity of the sums is
an appropriate procedure even though the $p_i$ values vary from trial to
trial and are not all small. Further, their work supports the assumption
that a test based on the Poisson distribution for comparing the number of
occurrences in two samples is applicable where the probability of the
attribute characteristic is small with respect to each sample. Bross and
Kasten [12] have examined the practical question as to how small the
proportion must be in general to yield satisfactory results with the
"exact" test. They concluded that the value of 20% "would seem to serve
as a rough guide for two-tailed tests at the 5% level," as compared with
the usual chi-square test with Yates' correction.

Table 7 illustrates another method for investigating this question.
Analogous binomial and Poisson cases, where the incidence of defectives
was relatively high, were used for comparison. The power of Cox's
"approximate" Poisson two-sample test at the 5% level was computed for
the cases selected and tabulated in Column (9) as shown. The
analogous Poisson case always yields the upper bound to the power of
the comparable binomial cases, but the difference in power is of no
practical importance.

TABLE 7

Power of Binomial Two-sample Test Using Critical Region
at 5% Level of "Approximate" Poisson Two-sample Test

[Table body not recoverable from the available copy.]

c. Small Inspection Lots - Referring to Table 1, with lot sizes
assumed to be infinite, the power of the "approximate" test for binomial
cases shown in Table 7 was computed from the product of two independent
binomials giving $P(d_s, d_c \mid p_s, p_c; n_s, n_c)$ equal to

(29)  $\dfrac{n_s!}{d_s!\,(n_s - d_s)!}\; p_s^{d_s}\, q_s^{(n_s - d_s)} \cdot \dfrac{n_c!}{d_c!\,(n_c - d_c)!}\; p_c^{d_c}\, q_c^{(n_c - d_c)}$

Entering in (29) the critical values of the "approximate" test for
Poisson variates (17) for $\alpha = 0.05$, and considering points which
deviate more from the null hypothesis, absolute probabilities were summed
to give the true power of the test under the assumption of infinite lot
sizes. The question is whether any further loss of power occurs if samples
are drawn from finite lots.
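
The summation just described can be sketched as follows. This is an illustration only. The critical region is passed in as a function, because the critical values of test (17) are tabulated elsewhere, and the rule used in the example is hypothetical and is not the rule of Table I. Here $q = 1 - p$.

```python
import math


def binomial_probability(n, d, p):
    """One binomial factor of expression (29), with q = 1 - p."""
    return math.comb(n, d) * p ** d * (1.0 - p) ** (n - d)


def power(n_s, n_c, p_s, p_c, in_critical_region):
    """Sum of expression (29) over all points (d_s, d_c) that fall in the
    critical region, for samples from lots assumed to be infinite."""
    total = 0.0
    for d_s in range(n_s + 1):
        for d_c in range(n_c + 1):
            if in_critical_region(d_s, d_c):
                total += (binomial_probability(n_s, d_s, p_s)
                          * binomial_probability(n_c, d_c, p_c))
    return total


if __name__ == "__main__":
    # Hypothetical action rule, for illustration only: act when the
    # consumer's count exceeds the supplier's count by three or more.
    def rule(d_s, d_c):
        return d_c >= d_s + 3

    print(power(30, 15, 0.05, 0.15, rule))   # power under the alternative
    print(power(30, 15, 0.05, 0.05, rule))   # size under the null hypothesis
```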

A complication ensues when samples that are not independent are
drawn from finite lots. Thus in product verification sampling, where
inspection lots may on occasion be small, the number of defectives removed
by the first sample affects the probability of a defective in the
validation sample. Moreover, the inspection practice is not to return
defectives found in the sample to the lot offered for acceptance.

If two successive samples, $n_s$ and $n_c$, are drawn without replacement
from a finite lot of size $N$ characterized initially by $p_s$ by the
supplier and $p_c$ by the consumer, the discrepancy being merely in the
count of the number of defectives in the lot, the probability of getting
the result $d_s, d_c$ is

(30)  [expression missing from the available copy]

This expression can be used for fixed $N$, $n_s$, $n_c$ to compute the
actual probabilities for a test of homogeneity applied to a finite lot.
One case was computed: $N = 160$, $n_s = 30$, $n_c = 15$, $p_s = 0.05$ and
$p_c = 0.15$, for comparison with analogous Poisson and binomial cases.
The power of the "approximate" test (17) for the jointly dependent
hypergeometric distribution case is 0.409, which exceeds the comparable
values in Table 7. Because of the extensive calculations required and
since $p_s$ and $p_c$ are generally small so that the Poisson approximation
holds, the power functions of the "approximate" test for other dependent
hypergeometric distributions were not computed.

The conditional probability of getting the result $d_s, d_c$, under the
null hypothesis of $p_s = p_c = p_0$, is obtained by dividing (30) by (31),
the expression for the probability of $d_t$,

(31)  [expression missing from the available copy]

which yields upon simplification

(32)  $\dfrac{n_s!\; n_c!\; d_t!\; (n_t - d_t)!}{d_s!\; d_c!\; (n_s - d_s)!\; (n_c - d_c)!\; n_t!}$

It will be noted that this representation of the conditional probability
for $d_s, d_c$ is identical to the expression obtained by application of
Fisher's "exact" test to the results of Table 1.
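
The conditional probability can be evaluated from factorials or, equivalently, from binomial coefficients. The sketch below assumes $n_t = n_s + n_c$ and $d_t = d_s + d_c$. The function names and the numerical case are illustrative only.

```python
import math


def conditional_probability(n_s, n_c, d_s, d_c):
    """Factorial form of the conditional probability of the split
    (d_s, d_c), given the total d_t = d_s + d_c, under the null hypothesis."""
    n_t, d_t = n_s + n_c, d_s + d_c
    f = math.factorial
    numerator = f(n_s) * f(n_c) * f(d_t) * f(n_t - d_t)
    denominator = f(d_s) * f(d_c) * f(n_s - d_s) * f(n_c - d_c) * f(n_t)
    return numerator / denominator


def fisher_point_probability(n_s, n_c, d_s, d_c):
    """The same probability written with binomial coefficients, the form
    usually quoted for Fisher's "exact" test of a 2 X 2 table."""
    return (math.comb(n_s, d_s) * math.comb(n_c, d_c)
            / math.comb(n_s + n_c, d_s + d_c))


if __name__ == "__main__":
    # Hypothetical case: n_s = 30, n_c = 15, three defectives in all.
    for d_c in range(4):
        print(d_c, conditional_probability(30, 15, 3 - d_c, d_c))
```

Neither $N$ nor $p_0$ appears in the result, so the conditional test requires no knowledge of the lot size or of the common lot quality.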

d. Estimating Product Quality - Results of acceptance sampling
can provide estimates of the over-all quality of the lots accepted and
rejected by a sampling plan but not of the segregated portions. Consider
the special case where lot quality is binomially-controlled. From a
theorem by Mood [27], if the distribution is binomial with parameter $p'$,
then the numbers of defectives in the samples and in the remaining parts
of the lots are independently and binomially distributed with the same
parameter $p'$. We would therefore expect that the proportion defective in
the remaining parts of lots rejected by an acceptance plan would be equal
to the proportion defective in the remaining parts of lots accepted by an
acceptance plan. But acceptance sampling results give lot quality
estimates of the fraction of production rejected, in terms of proportion
defective, which are generally higher than the lot quality estimates of
the fraction of product accepted by the sampling plan.

This bias is evident if we consider the effect of the OC curve of
a single sampling plan

(33)  $L_{p'} = \sum_{d=0}^{c} \binom{n}{d}\, q'^{\,n-d}\, p'^{\,d}$

where $L_{p'}$ denotes the probability of acceptance of lots produced by a
binomially-controlled process with parameter $p'$. Assume $n$ and $c$ have
been chosen so that $0 < L_{p'} < 1$. All accepted lots will be charac-
terized by $d/n \le c/n$ and all rejected lots by $d/n > c/n$. It is clear
that the estimates of lot quality afforded by the acceptance sampling
results do not reflect the true quality $p'$.
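
The bias can be shown numerically. The sketch below evaluates (33) and the expected sample fraction defective among accepted and among rejected lots. It assumes $q' = 1 - p'$ and a plan with $0 < L_{p'} < 1$. The function names and the plan ($n = 10$, $c = 0$, $p' = 0.1$) are illustrative only.

```python
import math


def binomial_probability(n, d, p):
    return math.comb(n, d) * p ** d * (1.0 - p) ** (n - d)


def probability_of_acceptance(n, c, p):
    """Equation (33): OC-curve ordinate of a single sampling plan (n, c)."""
    return sum(binomial_probability(n, d, p) for d in range(c + 1))


def sample_quality_estimates(n, c, p):
    """Expected sample fraction defective d/n, separately for accepted
    lots (d <= c) and rejected lots (d > c)."""
    accept = probability_of_acceptance(n, c, p)
    accepted = sum(d / n * binomial_probability(n, d, p)
                   for d in range(c + 1)) / accept
    rejected = sum(d / n * binomial_probability(n, d, p)
                   for d in range(c + 1, n + 1)) / (1.0 - accept)
    return accepted, rejected


if __name__ == "__main__":
    print(probability_of_acceptance(10, 0, 0.1))
    print(sample_quality_estimates(10, 0, 0.1))
```

In this example the samples from accepted lots estimate a fraction defective of zero, and those from rejected lots estimate more than the true 0.1, although by the theorem cited above the unsampled remainder of every lot has the same expected quality $p'$.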

Product verification sampling, being independent of the supplier's
acceptance sampling system, can be used for evaluating the true quality
of the supplies offered for consumer acceptance. Moreover, the
validation sampling results serve to check the supplier's reported
process average for control of reduced, normal or tightened inspection
under MIL-STD 105.

10. STATISTICAL CRITERIA FOR PAIRED ATTRIBUTE SAMPLINGS. Paired
attribute sampling results can be conveniently tested for statistical
significance by means of tables providing critical values for the
homogeneity tests described for Poisson variates. For a given number
of total defectives, $d_t$, observed in the supplier's and consumer's
samples, when $\alpha$ is specified, limits can be set for either $d_c$ or
$d_s$. This arrangement enumerates the boundary points of the critical
region of the test. However, the consumer usually desires to compare his
sample results, associated with the sample results recorded by the
supplier, against a "rejection number."

Table I of DoD Handbook H109, included in the Appendix of this
paper, was derived from (17). It sets forth an action number, depending
on $r$, for each value of $d_s$ which may be recorded by the supplier. When
an action number, denoted by $d_c(A)$, is reached or exceeded, the consumer
adopts a course of action on the premise that a discrepancy actually
exists in the supplier's inspection system. Tables IA and IB, derived
from (8) and (19), respectively, are alternate sets of critical values
included in the Appendix for comparison with Table I. The critical
values of these tables correspond to $\alpha = 0.05$ for a one-sided test.

The probability integral transformation of Table 5 or Table 6, for
given arguments, is readily evaluated from Tables of the Incomplete
Beta-Function Ratio [29] and natural logarithm tables.

Table II of DoD Handbook H109, subdivided into five sections
corresponding to $r$ values 1, 2, 3, 5 and 8, respectively, yields directly
the probability integral transformations, reduced by one-half, derived
from "approximate" tests (17) of Poisson variates. These values,
designated as "check ratings," when doubled are approximately distrib-
uted as $\chi^2$ with 2 degrees of freedom.

Table III of DoD Handbook H109 is a modified, extended table of the
percentage points of the $\chi^2$ distribution for even-numbered degrees of
freedom. As Table III is used in conjunction with Table II, the critical
values tabulated are $\tfrac{1}{2}\chi^2_{2k}$ for $2k$ degrees of freedom,
where $k$ is the number of probabilities to be combined, i.e., number of
lots verified. The warning and action limits in Table III have been set at
the 0.05 and 0.01 significance levels, respectively, and the median value
at the 0.50 level.
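
For even degrees of freedom the tail area of $\chi^2$ has a closed form, so limits of this kind can be computed without tables. The sketch below is an illustration only and is not the procedure by which Table III was prepared. It assumes that the sum of $k$ check ratings is distributed as $\tfrac{1}{2}\chi^2_{2k}$ under the null hypothesis. The function names are hypothetical.

```python
import math


def half_chi2_upper_tail(total, k):
    """P(0.5 * chi-square with 2k degrees of freedom >= total), from the
    closed form exp(-total) * sum_{j<k} total**j / j!."""
    return math.exp(-total) * sum(total ** j / math.factorial(j)
                                  for j in range(k))


def critical_value(alpha, k):
    """Value of the summed check ratings exceeded with probability alpha,
    found by bisection (the tail area decreases as the total increases)."""
    low, high = 0.0, 1.0
    while half_chi2_upper_tail(high, k) > alpha:
        high *= 2.0
    for _ in range(200):
        mid = 0.5 * (low + high)
        if half_chi2_upper_tail(mid, k) > alpha:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


if __name__ == "__main__":
    for k in (1, 2, 5):
        print(k, critical_value(0.50, k),
              critical_value(0.05, k), critical_value(0.01, k))
```

For a single lot ($k = 1$) the median and 5% values are $\log_e 2$ and $\log_e 20$, about 0.693 and 2.996.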

The accumulation of check ratings serves to summarize all available
sampling data bearing on the reliability of the supplier's inspection
results. Furthermore, the ratings establish an objective degree of
confidence in the relative accuracy of the supplier's results of
sampling inspection. In this connection, the following graphical device
may be used to show homogeneity of the paired sampling results: Using
semi-logarithmic graph paper, plot the check ratings obtained from
Table II of DoD Handbook H109 on the logarithmic scale against the
ordinal number of the test on the arithmetic scale. Superimpose on the
chart two horizontal lines corresponding to the median and $100\alpha\%$
levels, respectively, of the $\chi^2/2$ value for 2 degrees of freedom. Any
pronounced runs above or below the median line, or marked divergence
about the $100\alpha\%$ line, will indicate the likelihood of an assignable
cause at work, which should be investigated.

Table IV of DoD Handbook H109 was derived from (28), where $(1 - \gamma)$
represents the probability of acceptance of the null hypothesis of
homogeneity on the basis of a single trial. The probability of
acceptance of the hypothesis of homogeneity and not taking action in
$K$ trials is designated by $(1 - \epsilon)$. Table IV emphasizes the
continuing nature of the test for inspection concordance and is useful for
augmenting the power of a single test for homogeneity.

The power of the "approximate" test (17) for Poisson variates given
in Table 3 of this paper furnished the basis for Table V of DoD Handbook
H109. The latter shows the probability of the failure to reject
different alternate hypotheses. This probability of a Type II error is
also depicted in DoD Handbook H109 by an operating characteristic (OC)
curve. The set of OC curves provided in the handbook is applicable
for sample size ratios 1, 2, 3, 5 and 8 and nuisance parameters $p'_0 n_s$
generally encountered in practice.

The OC curves used in conjunction with Table IV are useful in
determining the size of the validation sample for a single trial. The
level of significance $\alpha$ at which a test is to be conducted was pre-
determined as 0.05. The alternative that we wish to protect against and
the risk that we are willing to take of making a Type II error need to
be determined by the consumer. The OC curves will then show what sample
size will satisfy the two conditions.

11. CONCLUDING REMARKS. The system of sampling inspection imposed
upon the supplier operates to assure a product meeting specified quality
standards. With the supplier's size of sample fixed by the acceptance
plan, the size of the validation sample can be varied by the consumer
subject to mathematical rules involving considerations of sampling risks,
etc. But its adjustment can also be based on external evidence that the
supplier is maintaining an acceptable quality control and inspection
system, or that inspection aids are properly calibrated and used.
Nevertheless, confirming data, generated by inspection of a portion of
the product by the consumer, is a sine qua non.

The validation sample used on an individual or skip-lot basis can
furnish an estimate of the quality of product offered for acceptance.
As this quality stabilizes at an acceptable level the consumer may step-
wise shift to a smaller verification sample. At each stage he may
consult the power function or OC curve of the significance test to
determine his risks. Conversely, with sufficient statistical sophistica-
tion, the consumer can select a sample size ratio based on the power of
test and the risk of not detecting an inspection discrepancy of
importance within a predetermined number of trials.

These techniques can be extended to provide a cost basis for decid-
ing upon the size of the validation sample. However, the objective of
this paper and DoD Handbook H109 is to provide a system of product
verification inspection which can be initiated and applied by a field
inspector with the training in statistics that he already has. For this
reason, the approach of classical statistics is used with predetermined
levels of significance. Tables of critical values are provided to avoid
computation and the techniques are simplified to the maximum extent
possible, without substantial loss of power of test. Fortunately, the
Poisson exponential limit is an effective approximation to the binomial
and hypergeometric distributions commonly met in industrial practice.

The tests of homogeneity for Poisson variates described in this paper
are sufficiently robust to have wide utility. They are appropriate in
many practical applications where mass comparisons of attribute data are
to be made and over-all conclusions are to be drawn. The "exact" test
has been applied for this purpose to long tabulations of research attribute
data [12]. However, since this test does not equalize the actual size and
the nominal significance level, there is a great loss of power. The
"approximate" test and its alternates, by maintaining an effective level
of significance, not only retain their power but may be combined for an
over-all test of a common hypothesis.

The choice of the "approximate" test as the basis for product
verification was determined by its applicability to a small number of
events without loss of power. Empirical trials have shown this method
to be practically as powerful as the randomization procedure described
by Tocher [22] and Pearson [23]. In this connection, Lancaster [16]
states that it is plausible to consider the median probability test
function as the result of a randomization procedure carried out before
the actual trial. However, since it is desirable to adopt one standard
procedure where the same judgment is always made on the same data, the
"approximate" procedure was selected as the method of choice for product
verification inspection.